NTCIR-4 Patent Retrieval Experiments at RICOH

Author

  • Hideo Itoh
Abstract

Focusing on the document structure of patents, we built two search databases, AC and WH, and compared their retrieval performance: AC includes only the abstract and claim sections of each patent, while WH includes the whole patent text. We also attempted to combine the search results from the two databases to improve retrieval performance. Another focus of our experiments is cross-lingual patent retrieval using a large, high-quality parallel corpus. The query submitted against the parallel database was partially translated and expanded using the same mechanism as for pseudo-relevance feedback.
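
The abstract does not say how the results from the two databases were combined. As a rough illustration only, the sketch below assumes a weighted sum of min-max normalized scores (a CombSUM-style fusion), which is a common generic choice and not necessarily the method used in the paper. The function names, the `ac_weight` parameter, and the document IDs are all hypothetical.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; if all scores are equal, map each to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_results(
    ac_scores: Dict[str, float],
    wh_scores: Dict[str, float],
    ac_weight: float = 0.5,  # hypothetical interpolation weight, not from the paper
) -> List[Tuple[str, float]]:
    """Combine two score lists by a weighted sum of normalized scores.

    A document missing from one list contributes 0.0 for that list.
    Ties are broken by document ID so the output order is deterministic.
    """
    ac_norm = min_max_normalize(ac_scores)
    wh_norm = min_max_normalize(wh_scores)
    fused = {
        doc: ac_weight * ac_norm.get(doc, 0.0)
        + (1.0 - ac_weight) * wh_norm.get(doc, 0.0)
        for doc in set(ac_norm) | set(wh_norm)
    }
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up scores for one query against the AC and WH databases.
ac = {"P1": 10.0, "P2": 5.0, "P3": 0.0}
wh = {"P2": 8.0, "P3": 4.0, "P4": 0.0}
ranked = fuse_results(ac, wh)
```

Normalizing before summing keeps one database from dominating merely because its raw scores are on a larger scale; the weight would have to be tuned on held-out topics.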

Similar resources

RICOH at NTCIR-2

At NTCIR-2, RICOH submitted eight runs for the Japanese IR task. Of the eight runs, four use the title field only and the other four use the description field only. RICOH's system is built on our English text retrieval system and augmented to handle Japanese text. The system features (1) hybrid retrieval using a combination of n-gram indexing and word-based document ranking; (2) word-based and ...

University of Tokyo/RICOH at NTCIR-3 Web Retrieval Task

In the NTCIR Web Task, we introduced new approaches in similarity retrieval, using one known relevant document and pseudo-relevance feedback, and in topic and target retrieval, incorporating link analysis. The experiments showed that both approaches were promising.

NTCIR-5 Patent Retrieval Experiments at Hitachi

In NTCIR-5, we used five retrieval methods proposed in NTCIR-4: (1) query term weighting using only document frequency, (2) stopword deletion, (3) two-stage patent retrieval, (4) term weighting considering “measurement terms”, and (5) related term expansion. In this paper, we compare the retrieval accuracy for two test sets: 34 main queries in NTCIR-4 and 1189 new queries in NTCIR-5. Then, we e...

Ricoh in the NTCIR4 CLIR Tasks

This paper describes Ricoh’s participation in the NTCIR-4 CLIR tasks. We used the same approach as we took at the NTCIR-3 IR tasks for Japanese. We applied our system using a Traditional/Simplified Chinese converter and n-gram indexing for the Chinese IR task. The results show that our simple approach for Chinese IR can provide information retrieval for both Traditional and Simplified Chinese.

NTCIR-4 PATENT Experiments at Osaka Kyoiku University - Gram-Based Passage Index and Essential Words

Long gram-based indices were tested in the NTCIR-4 patent task. No morphological analysis is required to build gram-based indices. The ABJ and DEJ tag fields are extracted from the NTCIR-4 patent corpus and indexed. Passages are also extracted and indexed. The total index size is 240 GB, and building the indices takes about 86 hours. By merging the result of passage retrieval with the result of docu...


Publication date: 2004